High Performance Linear Transform Program Generation for the Cell BE
Authors
Abstract
The Cell BE is among a new generation of multicore processors, including the Intel Larrabee and the Tilera TILE64, that provide impressive peak fixed- or floating-point performance for scientific, signal processing, visualization, and other engineering applications. As shown in Fig. 1, the Cell uses simple in-order cores designed specifically for numerical computing and requires explicit memory management to achieve maximal performance, which makes programming and optimizing it a challenge. In this paper, we extend Spiral [7], a program generation system, to generate highly optimized linear transform programs for the Cell BE. In doing so, as presented in [2], we extend Spiral’s architectural paradigms to support distributed-memory architectures like the Cell, which allow memory costs to be hidden using multibuffering techniques. We focus on fixed-size code for the 1D complex discrete Fourier transform (DFT), but also generate code for variants, including transforms that work on real input and on 2D input, and for other transforms, including the discrete cosine and sine transforms. We generate code for various usage scenarios, including latency-optimized and throughput-optimized code, and our system can handle various complex data formats and data distribution formats. The performance of Spiral-generated code for the Cell is comparable to, and in many cases better than, that of existing implementations, where available.

Spiral. Spiral automates the generation of platform-adapted high-performance libraries, with a focus on the domain of linear transforms. Spiral provides a range of functionality that is difficult to match with handwritten libraries, and its generated programs compare well against the performance of hand-optimized code. Spiral uses a domain-specific, declarative, mathematical language both to represent algorithms and to model the architecture at a high level. It uses rewriting to transform algorithms at a high level of abstraction to “fit” the target architecture.
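As an illustration of what representing an algorithm in a declarative, mathematical language can look like, the Cooley-Tukey FFT is commonly written as a factorization of the DFT matrix into structured sparse factors. The sketch below uses the standard Kronecker-product form of that factorization; the symbols m, n, T, and L are notation assumed here for illustration and are not taken from the abstract above.

```latex
% DFT of size n as a matrix, with \omega_n a primitive n-th root of unity
\mathrm{DFT}_n = \left[\, \omega_n^{k\ell} \,\right]_{0 \le k,\ell < n},
\qquad \omega_n = e^{-2\pi i / n}

% Cooley-Tukey factorization for a transform of size mn:
%   I_n          : n x n identity matrix
%   \otimes      : Kronecker (tensor) product
%   T^{mn}_{n}   : diagonal matrix of twiddle factors
%   L^{mn}_{m}   : stride permutation (reads the input at stride m)
\mathrm{DFT}_{mn}
  = \left(\mathrm{DFT}_m \otimes I_n\right)\,
    T^{mn}_{n}\,
    \left(I_m \otimes \mathrm{DFT}_n\right)\,
    L^{mn}_{m}
```

Read from right to left, the formula permutes the input, applies m independent DFTs of size n, scales by twiddle factors, and then applies n interleaved DFTs of size m. Structure of this kind (independent blocks, explicit data permutations) is what a rewriting system can plausibly map onto multiple cores and explicit data transfers.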
Similar resources
FFT Program Generation for the Cell BE
The complexity of the Cell BE’s architecture makes it difficult and time consuming to develop multithreaded, vectorized, high-performance numerical libraries. Our approach to solving this problem is to use Spiral, a program generation system, to automatically generate and optimize linear transform libraries for the Cell. To extend the Spiral framework to support the Cell architecture, we first ...
TITLE Spiral
Spiral is a program generation system (software that generates other software) for linear transforms and an increasing list of other mathematical functions. The goal of Spiral is to automate the development and porting of performance libraries. Linear transforms include the discrete Fourier transform (DFT), discrete cosine transforms, convolution, and the discrete wavelet transform. The input t...
Electric Power Generation with Reverse Electrodialysis
The computer simulation program of a practical scale reverse electrodialysis process has been developed based on the program for saline water electrodialysis. The program is applied to compute the performance of an industrial-scale reverse electrodialysis stack (effective membrane area S = 1 m × 1 m = 1 m², cell pair number N = 300 pairs). The stack operating conditions are optimized. Seaw...
Combined Use of Sensitivity Analysis and Hybrid Wavelet-PSO-ANFIS to Improve Dynamic Performance of DFIG-Based Wind Generation
In the past few decades, increasing growth of wind power plants causes different problems for the power quality in the grid. Normal and transient impacts of these units on the power grid clearly indicate the need to improve the quality of the electricity generated by them in the design of such systems. Improving the efficiency of the large-scale wind system is dependent on the control parameter...
The effect of vertical injection of reactants to the membrane electrode assembly on the performance of a PEM fuel cell
In order to present a new and high performance structure of PEM fuel cell and study the influence of the flow direction and distribution on the rate of reactants diffusion, three novel models of vertical reactant flow injection into the anode and cathode reaction area field have been introduced. They consist of one inlet and two inlets and also a continuous channel. The governing equations on t...
Publication date: 2009